Guidance

Amplicon Analysis at Microsynth

Objective: Amplicon analysis serves various purposes. Outside of resequencing, typical applications are metagenomics and other inventory-type analyses.

Scope: Microsynth supports this endeavour with sequencing and bioinformatics analysis.

The bioinformatics part of the analysis is described below.

Principal Workflow

Read quality filtering
Read denoising/clustering/dereplication
Downstream analysis
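
To illustrate the dereplication step named above, here is a minimal sketch in Python. It is not the pipeline used by Microsynth; the function name dereplicate and the example reads are hypothetical, and real denoising or clustering tools additionally account for sequencing errors.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical sequences and count how often each one occurs.

    Returns a list of (sequence, count) pairs, most abundant first.
    Assumption: case differences are not meaningful, so reads are upper-cased.
    """
    counts = Counter(seq.upper() for seq in sequences)
    # most_common() sorts by descending count
    return counts.most_common()


# Hypothetical reads from one sample
reads = ["ACGT", "ACGT", "acgt", "TTGA"]
unique = dereplicate(reads)
```

Each unique sequence then stands in for all of its identical copies in later steps, which reduces the amount of data that clustering or denoising has to handle.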

Principal Dataflow

Input: demultiplexed and quality-filtered reads in fastq format for each sample
Enrichment: read denoising/clustering/dereplication and aggregation with supplementary data (e.g., taxonomies)
Output: distribution of clusters in samples and associated statistics
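
The output described above, a distribution of clusters across samples, can be pictured as a count table. The following Python sketch shows the idea under simple assumptions: the cluster IDs, sample names, and the functions build_count_table and relative_abundance are hypothetical and do not reflect the delivered file layout.

```python
from collections import Counter


def build_count_table(samples):
    """Build a cluster-by-sample count table.

    samples: dict mapping a sample name to the list of cluster IDs
             assigned to its reads.
    Returns a dict: cluster ID -> {sample name -> read count}.
    """
    per_sample = {name: Counter(ids) for name, ids in samples.items()}
    clusters = sorted(set().union(*per_sample.values()))
    return {
        cluster: {name: per_sample[name][cluster] for name in samples}
        for cluster in clusters
    }


def relative_abundance(table, sample):
    """Return the fraction of a sample's reads that falls into each cluster."""
    total = sum(row[sample] for row in table.values())
    return {
        cluster: (row[sample] / total if total else 0.0)
        for cluster, row in table.items()
    }


# Hypothetical cluster assignments for two samples
samples = {"S1": ["c1", "c1", "c2"], "S2": ["c2"]}
table = build_count_table(samples)
fractions = relative_abundance(table, "S1")
```

Clusters that are absent from a sample receive a count of zero, so every row of the table covers all samples.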

Data Formats

Specialized data formats are listed below (html, pdf, xlsx and txt are excluded).

gz is a data compression format (cf. "zip") and can be decompressed with 7-Zip, for instance
fasta/fastq formats store sequence information and can be inspected with a text editor (e.g., Notepad++)
tab/tsv format separates data columns by tab characters (open with a text editor or Excel)
biom format represents biological sample by observation contingency tables (open with a text editor or specialized software)
RData is a binary representation of R objects (use R to load and handle)
newick is a plain text format used here to describe phylogenetic trees of sequences (open with a text editor or specialized software)
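
As an alternative to decompressing and opening files by hand, gzipped fastq files can also be read programmatically. The sketch below uses only the Python standard library and assumes the common four-line fastq record layout (header, sequence, separator, quality); the function read_fastq and the demo file name are hypothetical.

```python
import gzip
import os
import tempfile


def read_fastq(path):
    """Yield (read_id, sequence, quality) tuples from a fastq file.

    Assumption: every record spans exactly four lines.
    Files ending in .gz are decompressed on the fly.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, quality  # drop the leading '@'


# Demo: write a tiny gzipped fastq file and read it back
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "sample.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nII#I\n")
    records = list(read_fastq(demo_path))
```

The same function also reads uncompressed fastq files, since it only switches to gzip when the file name ends in .gz.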

Anatomy of Overview Page

Input Quality Assessment

This section details key steps in quality control.

FastQC html files for sequencing quality assessment

Amplicon Analysis

Read cluster summaries and statistics
Downstream analysis (differential analysis; if ordered)
Functional profiling (if ordered)

Entire Analysis

The folder containing all generated files for the analysis